Abstract: In the first chapter of Shannon's A Mathematical 
Theory of Communication, it is shown that the maximum entropy 
rate of an input process of a constrained system is limited by the 
combinatorial capacity of the system. Shannon considers systems 
where the constraints define regular languages and uses results 
from matrix theory in his derivations. In this work, the regularity 
constraint is dropped. Using generating functions, it is shown that 
the maximum entropy rate of an input process is upper-bounded 
by the combinatorial capacity in general. The presented results 
also allow for a new approach to systems with regular constraints. 
As an example, the results are applied to binary sequences that 
fulfill the (j, k) run-length constraint and by using the proposed 
framework, a simple formula for the combinatorial capacity is 
given and a maxentropic input process is defined. 

I. Introduction 

This work is motivated by the recent interest in the 
information-theoretic limits of systems with constraints that 
do not form a regular language. One example is the consideration 
of context-free languages with an application to genetic 
sequence modelling [1]; another is the investigation 
of asynchronous channels [2]. 

A constrained system allows the noiseless transmission 
of input sequences of weighted symbols that fulfill certain 
constraints on the symbol constellations. A natural question is 
how to efficiently encode a random source such that it becomes 
a valid input for a constrained system [3]. Furthermore, it is 
of interest to determine the ultimate performance of such an 
encoder, which is closely related to the entropy rate of random 
processes that generate strings that fulfill the constraints. 
The maximum entropy rate of all such processes is equal 
to the combinatorial capacity of the considered system in 
the case that the constraints form a regular language. This 
was originally shown in [4]. In [5], the authors show this 
property for a slightly generalized setup, since they allow 
non-integer valued symbol weights, as long as the set of 
weights is not too dense. We will define what "not too dense" 
means in Section III. Recently, the authors of [2] showed 
that combinatorial capacity and maximum entropy rate are 
equal for a specific class of constrained systems, which they 
call the asynchronous channel. It is worth noting that for 
asynchronous channels, the "not too dense" assumption is not 
necessarily fulfilled. 

In this work, we consider constrained systems with not 
necessarily regular constraints; we allow symbol weights 
taking arbitrary positive real values, and motivated by [2], 
we allow the set of symbol weights to be too dense. For 
this general class of constrained systems, we show how to 
represent such systems by generating functions. We give a 
new definition of combinatorial capacity that coincides with 
the original definition [4] when the weight set is not too dense. 
By invoking known results, we show that the combinatorial 
capacity of such systems is equal to the abscissa of convergence 
of the corresponding generating function. Finally, 
we define input processes of constrained systems and show 
that the maximum entropy rate of input processes is upper-bounded 
by the abscissa of convergence of the generating 
function. This is our main result: regardless of whether the 
"not too dense" property is fulfilled, and for arbitrary 
constraints, the combinatorial capacity is equal to the abscissa 
of convergence of the generating function, and the entropy 
rate is upper-bounded by the abscissa of convergence of the 
generating function. Through a detailed discussion of the (j, k) run-length 
constraint, we illustrate our ideas and show that our 
framework, besides being more general, also allows for a new 
approach to investigating regular systems. Namely, we derive 
a simple formula for the combinatorial capacity of the (j, k) 
constraint and we define an input process whose entropy rate 
is equal to the combinatorial capacity. 

The remainder of this paper is organized as follows. In 
Section II we define constrained systems and generating 
functions. In Section III we show how to calculate the 
combinatorial capacity. We then show in Section IV how the 
combinatorial capacity relates to entropy rates and finally, in 
Section V we show that the entropy rate of input processes 
is upper-bounded by the combinatorial capacity. 

II. Constrained Systems 

In this section, we define the class of constrained systems 
that we will investigate in this work and we show how to 
represent them by generating functions. 

Definition 1. A constrained system $\mathbf{A} = (A, w)$ consists of 
a countable set $A$ of strings accepted by the system and an 
associated weight function $w\colon A \to \mathbb{R}_{>0}$ ($\mathbb{R}_{>0}$ denotes the 
positive real numbers) with the following property: if $a, b \in A$ 
and $ab \in A$ then $w(ab) = w(a) + w(b)$. 

Example 1. Definition of $S_{(j,k)}$. 

A system $S_{(j,k)} = (A, w)$ accepting binary strings that fulfill the (j, k) constraint can be defined as follows using regular 
expressions [6]: 

\[
\begin{aligned}
A ={}& (1 \cup \dots \cup \underbrace{1\cdots1}_{j\text{ times}})\,\bigl[(0 \cup \dots \cup \underbrace{0\cdots0}_{k\text{ times}})(1 \cup \dots \cup \underbrace{1\cdots1}_{j\text{ times}})\bigr]^{*}\,(\epsilon \cup 0 \cup \dots \cup \underbrace{0\cdots0}_{k\text{ times}}) \\
&\cup\, (0 \cup \dots \cup \underbrace{0\cdots0}_{k\text{ times}})\,\bigl[(1 \cup \dots \cup \underbrace{1\cdots1}_{j\text{ times}})(0 \cup \dots \cup \underbrace{0\cdots0}_{k\text{ times}})\bigr]^{*}\,(\epsilon \cup 1 \cup \dots \cup \underbrace{1\cdots1}_{j\text{ times}})
\end{aligned}
\tag{E.1}
\]

\[ w(0) = w(1) = 1. \tag{E.2} \]

The symbol $*$ denotes the Kleene star, $\cup$ denotes the or-operation, $\epsilon$ denotes the empty string with $w(\epsilon) = 0$, and 11 denotes 
"1 concatenated with 1". Note that concatenation is non-commutative: $1(1 \cup 0) = 11 \cup 10 \neq 11 \cup 01$. 

Example 2. Generating function of $S_{(j,k)}$. 

From (E.1) and (E.2) in Example 1, the generating function of $S_{(j,k)}$ can directly be derived as 

\[
\begin{aligned}
G_{(j,k)}(s) ={}& (e^{-s} + \dots + e^{-js}) \sum_{n=0}^{\infty} \bigl[(e^{-s} + \dots + e^{-ks})(e^{-s} + \dots + e^{-js})\bigr]^{n} (1 + e^{-s} + \dots + e^{-ks}) \\
&+ (e^{-s} + \dots + e^{-ks}) \sum_{n=0}^{\infty} \bigl[(e^{-s} + \dots + e^{-js})(e^{-s} + \dots + e^{-ks})\bigr]^{n} (1 + e^{-s} + \dots + e^{-js}).
\end{aligned}
\tag{E.3}
\]

A discussion of how to derive generating functions from regular expressions can be found in [7]. 

The weight of a symbol can have different practical meanings. 
In the context of magnetic recording systems, "weight" 
will probably refer to "tape-length"; other meanings like 
"time" or "energy" are possible, depending on the modelled 
system. For an illustration of our definition, we define in 
Example 1 the system $S_{(j,k)}$, which accepts binary sequences 
that contain at most j consecutive 1s and at most k 
consecutive 0s. This constraint is called the (j, k) run-length 
constraint. We return to $S_{(j,k)}$ at several points in our paper. 

A. Generating Functions 

To analyze the asymptotic behavior of constrained systems, 
we first represent the set A of allowed strings together with the 
weight function w by a generating function. We then interpret 
the generating function as a function on the complex plane and 
investigate its convergence behavior. This approach, mostly 
referred to as analytic combinatorics, is discussed in detail in 
[8]. We consider a more general case since we do not restrict 
the range of the weight function $w$ to the natural numbers, 
but allow for the set of positive real numbers $\mathbb{R}_{>0}$. Therefore, 
we use general Dirichlet series [9] instead of power series as 
generating functions. 

Definition 2. Let $\mathbf{A} = (A, w)$ represent a constrained system. 
We define the generating function of $\mathbf{A}$ by 

\[ G_{\mathbf{A}}(s) = \sum_{a \in A} e^{-w(a)s}, \quad s \in \mathbb{C} \tag{1} \]

where $\mathbb{C}$ denotes the set of complex numbers. 

Let $\Omega$ denote the set of distinct string weights of elements in 
$A$. We order and index the set $\Omega$ such that $\Omega = \{\nu_k\}_{k=1}^{\infty}$ with 
$\nu_1 < \nu_2 < \cdots$. For every $\nu_k \in \Omega$, $N(\nu_k)$ denotes the number 
of distinct strings of weight $\nu_k$ in $A$. We can now write the 
generating function as 

\[ G_{\mathbf{A}}(s) = \sum_{k=1}^{\infty} N(\nu_k)\, e^{-\nu_k s}. \tag{2} \]

Since the coefficients $N(\nu_k)$ result from an enumeration, they 
are all non-negative. In Example 2 we show how to represent 
$S_{(j,k)}$ by a generating function. 
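As a sanity check of the representation (2), the following sketch expands the generating function (E.3) of the (j, k) system as a power series in $x = e^{-s}$ (possible here because all string weights are integers) and compares its coefficients $N(n)$ with a brute-force count of the binary strings of length $n$ that satisfy the constraint. The function names (`counts_from_generating_function`, `counts_by_enumeration`, and the polynomial helpers) are illustrative choices and not part of the original paper.

```python
from itertools import product

def run_poly(m, deg):
    """Coefficients of x + x^2 + ... + x^m truncated at degree deg (x stands for e^{-s})."""
    return [1 if 1 <= i <= m else 0 for i in range(deg + 1)]

def poly_mul(p, q, deg):
    """Product of two coefficient lists, truncated at degree deg."""
    out = [0] * (deg + 1)
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            if a + b <= deg:
                out[a + b] += pa * qb
    return out

def poly_add(p, q):
    """Coefficient-wise sum of two lists of equal length."""
    return [a + b for a, b in zip(p, q)]

def counts_from_generating_function(j, k, deg):
    """Coefficients N(0), ..., N(deg) of (E.3), read as a power series in x."""
    ones, zeros = run_poly(j, deg), run_poly(k, deg)
    one = [1] + [0] * deg
    pair = poly_mul(zeros, ones, deg)   # one (0-run)(1-run) block, lowest degree 2
    star = [0] * (deg + 1)              # truncated geometric series sum_n pair^n
    term = one
    for _ in range(deg // 2 + 1):       # pair^n has no terms below degree 2n
        star = poly_add(star, term)
        term = poly_mul(term, pair, deg)
    starts_with_one = poly_mul(poly_mul(ones, star, deg), poly_add(one, zeros), deg)
    starts_with_zero = poly_mul(poly_mul(zeros, star, deg), poly_add(one, ones), deg)
    return poly_add(starts_with_one, starts_with_zero)

def counts_by_enumeration(j, k, deg):
    """Brute-force count of non-empty binary strings with 1-runs <= j and 0-runs <= k."""
    counts = [0] * (deg + 1)
    for n in range(1, deg + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if "1" * (j + 1) not in s and "0" * (k + 1) not in s:
                counts[n] += 1
    return counts

if __name__ == "__main__":
    print(counts_from_generating_function(2, 3, 10))
    print(counts_by_enumeration(2, 3, 10))
```

The two printed lists should agree, which reflects that the regular expression (E.1) parses every admissible string in exactly one way, as a sequence of alternating maximal runs.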

III. Combinatorial Capacity 

In previous works that consider non-integer valued weights 
[5], [10], the authors restrict themselves to constrained systems 
where the ordered set of string weights $\Omega = \{\nu_k\}_{k=1}^{\infty}$ is not 
too dense, that is, there exists some constant $L \geq 0$ and some 
constant $K \geq 0$ such that for any integer $n \geq 0$ 

\[ \max_{\nu_k < n} k \leq L n^{K}. \tag{3} \]

Under the "not too dense" assumption, it is meaningful to 
follow Shannon's original definition and to identify the combinatorial 
capacity with $C_0$ given by 

\[ C_0 = \limsup_{k \to \infty} \frac{\ln N(\nu_k)}{\nu_k} \tag{4} \]

as was done for instance in [5] and [10]. Here and hereafter, 
$\ln$ denotes the natural logarithm. Throughout the paper, for 
a sequence $\{s_l\}_{l=1}^{\infty}$, $s = \limsup_{l \to \infty} s_l$ is equivalent to the 
following: for any $\epsilon > 0$, it holds that 

\[ s_l < s + \epsilon \quad \text{almost everywhere (a.e.)} \tag{5} \]

and 

\[ s_l > s - \epsilon \quad \text{infinitely often (i.o.)} \tag{6} \]

with respect to $l \in \mathbb{N}$ ($\mathbb{N}$ denotes the set of natural numbers 
starting with one). 

If $\Omega$ is too dense, the number of possible string weights in 
the interval $[n, n + 1]$ increases faster than polynomially with 
$n$, in which case identifying combinatorial capacity with $C_0$ 
may become inappropriate. See [10] for an example and see 
[5] for a detailed discussion. 

We now give a definition of combinatorial capacity that is 
meaningful also when the ordered set of string weights $\Omega$ is 
too dense. 

Definition 3. We define the combinatorial capacity $C$ as 

\[ C = \limsup_{k \to \infty} \frac{\ln \sum_{l=1}^{k} N(\nu_l)}{\nu_k}. \tag{7} \]

The motivation for our generalized definition is twofold. 
First, our results on entropy rates for constrained systems, 
which we will present in the remaining sections, do not depend 
on the "not too dense" property. Second, recent work [2] 
has shown that there are constrained systems of practical 
interest that may not necessarily have the "not too dense" 
property. The following theorem shows that Definition 3 of 
combinatorial capacity is consistent with the conventional 
definition (4), namely, if $\Omega$ is not too dense, then our definition 
of combinatorial capacity coincides with (4), i.e., $C = C_0$. 

Theorem 1. Let $\mathbf{A} = (A, w)$ be a constrained system with 
the set of distinct string weights $\Omega$ and generating function 
$G_{\mathbf{A}}(s)$. Denote the abscissa of convergence of $G_{\mathbf{A}}$ by $Q$. The 
following holds: 

(i) The combinatorial capacity $C$ is equal to $Q$, i.e., 

\[ \limsup_{k \to \infty} \frac{\ln \sum_{l=1}^{k} N(\nu_l)}{\nu_k} = Q. \tag{8} \]

(ii) If the set of distinct weights $\Omega$ is not too dense, then 
$C_0 = C$, i.e., 

\[ \limsup_{k \to \infty} \frac{\ln N(\nu_k)}{\nu_k} = \limsup_{k \to \infty} \frac{\ln\bigl[\sum_{l=1}^{k} N(\nu_l)\bigr]}{\nu_k}. \tag{9} \]

Proof: We start by proving statement (i). All coefficients 
$N(\nu_k)$ are non-negative since they result from an enumeration. 
Therefore, for all $k \in \mathbb{N}$, $N(\nu_k) = |N(\nu_k)|$. With this 
observation, (i) follows directly from [9, Theorem 7]. To prove 
statement (ii), we assume that $\Omega$ is not too dense. In this 
case $C_0 = Q$, which was shown in [10, Lemma 1]. Statement 
(ii) now follows from statement (i). Alternatively, (ii) can be 
shown by combinatorial arguments, see [7, Appendix B.3]. ■ 



Returning to our example, the set of symbol weights of 
$S_{(j,k)}$ is not too dense, because the underlying alphabet {0, 1} 
of $A$ is finite. See [5, Appendix A] for a more detailed 
discussion of this argument. Using Theorem 1 we derive in 
Example 3 a simple formula for the combinatorial capacity of 
$S_{(j,k)}$. 

Example 3. Combinatorial capacity of $S_{(j,k)}$. 

By Theorem 1 the combinatorial capacity of $S_{(j,k)}$ is 
given by the abscissa of convergence of its generating 
function $G_{(j,k)}$. From (E.3) in Example 2 we see that the 
abscissa of convergence of $G_{(j,k)}$ is given by the largest 
positive real solution of 

\[ (e^{-s} + \dots + e^{-js})(e^{-s} + \dots + e^{-ks}) = 1. \tag{E.4} \]

This formula coincides with the formula given in [11, 
Theorem 2] and it can also be derived by applying the 
techniques introduced in [12]. 
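The formula (E.4) is easy to evaluate numerically. The sketch below solves it by bisection and, as a plausibility check of $C = C_0$, compares the result with the growth rate $\ln N(n)/n$ of the number of admissible strings of length $n$, counted by a simple dynamic program over run lengths. The helper names (`capacity_jk`, `count_strings`) and the bisection bracket $[0, \ln 2]$ are our own choices for illustration; the bracket works because the left-hand side of (E.4) is decreasing in $s$, is at least 1 at $s = 0$, and is below 1 at $s = \ln 2$.

```python
import math

def run_sum(x, m):
    """x + x^2 + ... + x^m."""
    return sum(x ** i for i in range(1, m + 1))

def capacity_jk(j, k, tol=1e-12):
    """Solve (E.4) for s by bisection on [0, ln 2]; result is in nats per symbol."""
    def f(s):
        x = math.exp(-s)
        return run_sum(x, j) * run_sum(x, k) - 1.0
    lo, hi = 0.0, math.log(2.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def count_strings(j, k, n):
    """Number of binary strings of length n >= 1 with 1-runs <= j and 0-runs <= k."""
    ones = [0] * (j + 1)    # ones[r]: strings ending in a run of exactly r ones
    zeros = [0] * (k + 1)   # zeros[r]: strings ending in a run of exactly r zeros
    ones[1] = zeros[1] = 1
    for _ in range(n - 1):
        new_ones = [0] * (j + 1)
        new_zeros = [0] * (k + 1)
        new_ones[1] = sum(zeros)      # a 1 after a 0-run starts a new run
        new_zeros[1] = sum(ones)      # a 0 after a 1-run starts a new run
        for r in range(2, j + 1):
            new_ones[r] = ones[r - 1]
        for r in range(2, k + 1):
            new_zeros[r] = zeros[r - 1]
        ones, zeros = new_ones, new_zeros
    return sum(ones) + sum(zeros)

if __name__ == "__main__":
    c = capacity_jk(2, 3)
    n = 400
    print("C from (E.4):", c)
    print("ln N(n) / n :", math.log(count_strings(2, 3, n)) / n)
```

For $j = 1$ and large $k$, the computed value approaches the logarithm of the golden ratio, the well-known capacity of binary sequences without two consecutive 1s.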

IV. The Relation Between Combinatorial Capacity 
and Entropy Rates 

Having defined the combinatorial capacity of constrained 
systems in the last section, we now consider 
random processes that generate strings that fulfill the 
constraints of the considered system. We then want to know 
how the maximum entropy rate of such a process relates to 
the combinatorial capacity. Ultimately, we have a process in 
mind that generates at each time instant a substring, which is 
then appended to the string that has been generated so far, 
such that at each time instant, the generated string fulfills 
the constraints of the system. In magnetic recording, such 
a process would generate a substring, write it to the tape, 
generate another substring, write it to the tape, and so forth, 
without ever rewinding the tape. The difficulty of analyzing 
such a process is the following: fix two time instants $l$ and 
$l'$, $l < l'$. The probability that the process writes a specific 
string to the tape until time instant $l'$ depends in general on 
the probabilities of the strings that it can write until time 
instant $l$. This dependency can become arbitrarily complicated 
depending on the constraints of the system. Because of these 
interdependencies, it is difficult to bound the entropy rate of 
such a process. We resolve these interdependencies by decoupling 
the time instants $l$ and $l'$: each time the recording system 
wants to write to the tape, it first rewinds the tape completely 
and then overwrites everything that has been written before. 
We call such a system an input source (in contrast to an 
input process) of a constrained system. In this section, we 
show through a series of results that the entropy rate of input 
sources is upper-bounded by the combinatorial capacity and 
we postpone input processes until Section V. 

A. Input Sources for Constrained Systems 

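Definition 4. Let $\mathbf{A} = (A, w)$ denote a constrained system. 
Denote by $\mathbf{X} = \{X_l\}_{l=1}^{\infty}$ a sequence of random variables and 
denote by $\mathcal{X}_l$ the support of $X_l$. We say that $\mathbf{X}$ is an input 
source of $\mathbf{A}$ if and only if 

(i) $\bigcup_{l=1}^{\infty} \mathcal{X}_l \subseteq A$ and $\mathcal{X}_l \cap \mathcal{X}_k = \emptyset$ if $l \neq k$. 

(ii) For each $l \in \mathbb{N}$, $\mathcal{X}_l \neq \emptyset$. 

We define in Example 4 an input source for $S_{(j,k)}$. Note 
that the given example is not the only possible input source 
of $S_{(j,k)}$. 

Example 4. An input source for $S_{(j,k)}$. 

The sequence of random variables $\{X_l\}_{l=1}^{\infty}$ with support 
of $X_l$ given by 

\[ \mathcal{X}_l = \bigl[(0 \cup \dots \cup \underbrace{0\cdots0}_{k\text{ times}})(1 \cup \dots \cup \underbrace{1\cdots1}_{j\text{ times}})\bigr]^{l} \tag{E.5} \]

is an input source of $S_{(j,k)}$: first, $\bigcup_{l=1}^{\infty} \mathcal{X}_l \subseteq A$ and $\mathcal{X}_l \cap \mathcal{X}_k = \emptyset$ 
whenever $l \neq k$ and second, $\mathcal{X}_l \neq \emptyset$ for each $l \in \mathbb{N}$, 
which shows that both condition (i) and condition (ii) of 
Definition 4 are fulfilled. 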

We denote the probability mass function (PMF) of $X_l$ by 

\[ p_{X_l}(x) = P[X_l = x], \quad x \in \mathcal{X}_l. \tag{10} \]

Definition 5. We define the entropy rate $H$ of an input source 
$\mathbf{X}$ by 

\[ H(\mathbf{X}) = \limsup_{l \to \infty} \frac{\mathbb{H}(X_l)}{E[w(X_l)]} \tag{11} \]

where $E[w(X_l)]$ denotes the average weight of all $x \in \mathcal{X}_l$ with 
respect to the PMF $p_{X_l}$ and where $\mathbb{H}(X_l)$ denotes the entropy 
of $X_l$ in nats. 

We can upper-bound the entropy rate $H(\mathbf{X})$ by maximizing 
each term of the sequence on the right-hand side of (11) 
separately. To do so, we need the following lemma: 

Lemma 1. Denote by $p_Z$ the PMF of some random variable 
$Z$ with countable support $\mathcal{Z}$ and an associated positive weight 
function $w$. The maximum entropy per average weight 

\[ R_Z = \max_{p_Z} \frac{\mathbb{H}(Z)}{E[w(Z)]} \tag{12} \]

is given by the greatest positive real solution of the equation 

\[ \sum_{z \in \mathcal{Z}} e^{-w(z)s} = 1. \tag{13} \]

In addition, the PMF of $Z$ that achieves this rate is uniquely 
given by 

\[ q_Z(z) = e^{-w(z)R_Z}, \quad z \in \mathcal{Z}. \tag{14} \]

Proof: These two properties of $R_Z$ were derived by using 
Lagrange multipliers in [13] and they were independently 
derived in [14] by using the bound $\ln z \leq z - 1$. We offer an 
alternative proof by applying the information inequality [15], 
which states for the Kullback-Leibler distance $D(\cdot\|\cdot)$ of two 
PMFs $p$ and $q$ that 

\[ D(p\|q) \geq 0 \tag{15} \]

with equality if and only if $p = q$. We thus have 

\begin{align}
0 &\geq -D(p_Z\|q_Z) \tag{16} \\
  &= \sum_{z \in \mathcal{Z}} p_Z(z) \ln \frac{q_Z(z)}{p_Z(z)} \tag{17} \\
  &= \mathbb{H}(Z) - R_Z\, E[w(Z)] \tag{18}
\end{align}

which implies 

\[ \frac{\mathbb{H}(Z)}{E[w(Z)]} \leq R_Z \tag{19} \]

with equality if and only if $p_Z = q_Z$. ■ 
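A small numerical illustration of Lemma 1, under the assumption of a finite support with at least two elements: the sketch solves (13) by bisection, builds the PMF (14), and checks that its entropy per average weight equals $R_Z$, while another PMF (here the uniform one) achieves less. The function names and the example weights are hypothetical and chosen only for illustration.

```python
import math

def max_rate(weights, tol=1e-13):
    """Greatest positive solution s of sum_z exp(-w(z) s) = 1, cf. (13).

    Assumes a finite list of positive weights with at least two entries,
    so that the left-hand side exceeds 1 at s = 0 and decreases in s.
    """
    def f(s):
        return sum(math.exp(-w * s) for w in weights) - 1.0
    lo, hi = 0.0, 1.0
    while f(hi) > 0:          # enlarge the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def entropy_per_weight(pmf, weights):
    """Entropy in nats divided by the average weight, the objective in (12)."""
    entropy = -sum(p * math.log(p) for p in pmf if p > 0)
    avg_weight = sum(p * w for p, w in zip(pmf, weights))
    return entropy / avg_weight

if __name__ == "__main__":
    weights = [1.0, 1.5, 2.5]                       # non-integer weights are allowed
    r = max_rate(weights)
    q = [math.exp(-w * r) for w in weights]         # maximizing PMF, cf. (14)
    uniform = [1.0 / len(weights)] * len(weights)
    print("R_Z                 :", r)
    print("rate of q_Z         :", entropy_per_weight(q, weights))
    print("rate of uniform PMF :", entropy_per_weight(uniform, weights))
```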

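Lemma 2. Let $\mathbf{X}$ denote an input source of some constrained 
system. Let the rate bound $R_{\mathbf{X}}$ be defined as 

\[ R_{\mathbf{X}} = \limsup_{l \to \infty} R_{X_l} \tag{20} \]

where each $R_{X_l}$ is chosen according to Lemma 1. The entropy 
rate $H(\mathbf{X})$ of $\mathbf{X}$ is then upper-bounded by $R_{\mathbf{X}}$. 

Proof: We have 

\begin{align}
H(\mathbf{X}) &= \limsup_{l \to \infty} \frac{\mathbb{H}(X_l)}{E[w(X_l)]} \tag{21} \\
&\leq \limsup_{l \to \infty} \max_{p_{X_l}} \frac{\mathbb{H}(X_l)}{E[w(X_l)]} \tag{22} \\
&= \limsup_{l \to \infty} R_{X_l} \tag{23} \\
&= R_{\mathbf{X}}. \tag{24}
\end{align}

■ 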

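Lemma 3. Let $\mathbf{A} = (A, w)$ represent a constrained system 
and let $\mathbf{X}$ denote an input source of $\mathbf{A}$. The rate bound $R_{\mathbf{X}}$ 
of $\mathbf{X}$ is then upper-bounded by the abscissa of convergence 
$Q$ of $G_{\mathbf{A}}$, i.e., 

\[ R_{\mathbf{X}} \leq Q. \tag{25} \]

Proof: To prove the lemma, we show that the generating 
function $G_{\mathbf{A}}(s)$ diverges whenever $\mathrm{Re}(s) < R_{\mathbf{X}}$, for any input 
source $\mathbf{X}$. 

The definition of the rate bound $R_{\mathbf{X}}$ in (20) implies in 
particular that for any $\epsilon > 0$ 

\[ R_{X_l} > R_{\mathbf{X}} - \epsilon \quad \text{i.o.} \tag{26} \]

According to (13), $R_{X_l}$ is given by the greatest positive real 
solution of 

\[ \sum_{x \in \mathcal{X}_l} e^{-w(x)s} = 1 \tag{27} \]

which implies further 

\[ \sum_{x \in \mathcal{X}_l} e^{-w(x)[R_{\mathbf{X}} - \epsilon]} > \sum_{x \in \mathcal{X}_l} e^{-w(x)R_{X_l}} = 1 \quad \text{i.o.} \tag{28} \]

Because the supports $\mathcal{X}_l$ are pairwise disjoint and $\bigcup_{l=1}^{\infty} \mathcal{X}_l \subseteq A$ according to Definition 4, we can 
bound the generating function by 

\[ G_{\mathbf{A}}(s) = \sum_{a \in A} e^{-w(a)s} \geq \sum_{l=1}^{n} \sum_{x \in \mathcal{X}_l} e^{-w(x)s}. \tag{29} \]

Because of (28) we have 

\[ \sum_{l=1}^{n} \sum_{x \in \mathcal{X}_l} e^{-w(x)[R_{\mathbf{X}} - \epsilon]} \xrightarrow{\,n \to \infty\,} \infty \tag{30} \]

and we conclude that for any $\epsilon > 0$, $G_{\mathbf{A}}$ diverges in $s = 
R_{\mathbf{X}} - \epsilon$. Thus, by [9, Theorem 3], $G_{\mathbf{A}}$ diverges for all $s \in \mathbb{C}$ 
with $\mathrm{Re}(s) < R_{\mathbf{X}}$. Since by definition of $Q$, $G_{\mathbf{A}}$ converges 
for $\mathrm{Re}(s) > Q$, it must hold that $R_{\mathbf{X}} \leq Q$. ■ 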

V. Maxentropic Input Processes 

We now come to the main concern of this work: we want to 
define input processes for constrained systems and we want to 
investigate how the entropy rate of an input process is related 
to the combinatorial capacity. Loosely speaking, we want to 
define a random process that generates at each time instant a 
substring, which is then appended to the string that has been 
generated so far. At each time instant, the complete string 
should fulfill the constraints of the considered system. The 
notion of an input process differs fundamentally from what 
we defined as an input source: an input source generates a 
complete new string at each time instant. Before we give our 
definition of input processes, we motivate our definition by the 
following example. Consider a constrained system $S_{\mathrm{bin}}$ that 
accepts any binary sequence, and assume $w(1) = w(0) = 1$. 
The combinatorial capacity is $C = \ln(2) \approx 0.6931$. Denote 
by $\mathbf{V} = \{V_l\}_{l=1}^{\infty}$ the random process where the $V_l$ take values 
in {0, 1, 01} and are independent, identically distributed (IID) 
according to the PMF $p_V$, which we define as follows: 

\[ p_V(0) = p_V(1) = e^{-R}, \quad p_V(01) = e^{-2R} \tag{31} \]

where $R$ is given by the largest positive real solution of 

\[ 2e^{-s} + e^{-2s} = 1. \tag{32} \]

Obviously, $\mathbf{V}$ generates binary strings that are accepted by 
$S_{\mathrm{bin}}$. We are interested in the entropy rate of $\mathbf{V}$ and calculate 

\begin{align}
\frac{\mathbb{H}(V)}{E[w(V)]} &= R \tag{33} \\
&\approx 0.8814 \tag{34} \\
&> C \tag{35}
\end{align}

where equality in (33) follows from Lemma 1. Surprisingly, 
the entropy rate of $\mathbf{V}$ seems to exceed the combinatorial 
capacity of $S_{\mathrm{bin}}$. The reason for this is that we implicitly 
assume in our attempt to calculate the entropy rate of $\mathbf{V}$ that, 
for example, the realizations $v_1 = 01$ and $(v_1, v_2) = (0, 1)$ 
are distinguishable. However, they are not: both result in the 
string 01, so we are counting this string twice. To avoid this 
pitfall, we define input processes as follows. 
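The double counting can be made concrete with a few lines of code. The sketch below evaluates $R$ from (32) in closed form (substituting $x = e^{-s}$ gives $x^2 + 2x - 1 = 0$, so $x = \sqrt{2} - 1$) and counts in how many ways a given binary string can be produced as a concatenation of symbols from {0, 1, 01}. The function name `num_parsings` is an illustrative choice of ours.

```python
import math

TOKENS = ("0", "1", "01")   # values taken by each V_l in the example

def num_parsings(s):
    """Number of ways to write the string s as a concatenation of TOKENS."""
    ways = [1] + [0] * len(s)          # ways[i]: parsings of the prefix s[:i]
    for end in range(1, len(s) + 1):
        for t in TOKENS:
            if end >= len(t) and s[end - len(t):end] == t:
                ways[end] += ways[end - len(t)]
    return ways[len(s)]

# Largest positive real solution of (32): x = e^{-s} solves x^2 + 2x - 1 = 0.
R = -math.log(math.sqrt(2.0) - 1.0)
C = math.log(2.0)

if __name__ == "__main__":
    print("R =", round(R, 4), " C =", round(C, 4))
    for s in ("00", "01", "0101"):
        print(s, "can be generated in", num_parsings(s), "way(s)")
```

Every occurrence of the substring 01 can be produced either as the single symbol 01 or as the pair (0, 1), which is why the naive rate calculation exceeds $\ln 2$.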

Definition 6. The random process $\{Y_l\}_{l=1}^{\infty}$, $Y_l \in \mathcal{Y}$, is an input 
process of the constrained system $\mathbf{A} = (A, w)$ if the sequence 
of random variables $X_l = \mathrm{cat}(Y_1, \dots, Y_l)$ with the supports 
truncated to 

\[ \mathcal{X}_l = \bigl\{\mathrm{cat}(y_1, \dots, y_l) \,\big|\, (y_1, \dots, y_l) \in \mathcal{Y}^l,\ \Pr(y_1, \dots, y_l) > 0 \bigr\} \tag{36} \]

is an input source of $\mathbf{A}$. The operator cat denotes concatenation: 
$\mathrm{cat}(a, b) = ab$. 

In the following, we refer to the assignment (36) as the 
truncated support. 

The process $\mathbf{V}$ as we defined it earlier is not an input 
process of $S_{\mathrm{bin}}$: define $X_l = \mathrm{cat}(V_1, \dots, V_l)$, $l = 1, 2, \dots$. 
The random variables $X_1$ and $X_2$ have the following truncated 
supports: 

\[ \mathcal{X}_1 = \{0, 1, 01\} \tag{37} \]
\[ \mathcal{X}_2 = \{00, 01, 001, 10, 11, 101, 010, 011, 0101\}. \tag{38} \]

As we can see, $\mathcal{X}_1 \cap \mathcal{X}_2 = \{01\} \neq \emptyset$, so $\mathbf{X}$ is not an input source 
of $S_{\mathrm{bin}}$ and thus, by definition, $\mathbf{V}$ is not an input process of 
$S_{\mathrm{bin}}$. By changing the PMF of the $V_l$ to 

\[ p_V(0) = p_V(1) = \frac{1}{2}, \quad p_V(01) = 0 \tag{39} \]

each random variable $X_l$ has the truncated support 

\[ \mathcal{X}_l = (0 \cup 1)^{l}. \tag{40} \]

Thus, with the probability assignment (39), $\mathcal{X}_l \cap \mathcal{X}_k = \emptyset$ 
whenever $k \neq l$, and consequently, $\mathbf{X}$ is an input source and 
$\mathbf{V}$ is an input process. 
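The truncated supports (37), (38) and (40) can be reproduced by direct enumeration. The sketch below assumes an IID process, so that a tuple $(y_1, \dots, y_l)$ has positive probability exactly when each of its symbols does; the function name `truncated_support` and the dictionary representation of the PMFs are illustrative choices.

```python
from itertools import product

def truncated_support(pmf, l):
    """Truncated support (36) of X_l = cat(Y_1, ..., Y_l) for an IID process.

    pmf maps each symbol (a string) to its probability. Under the IID
    assumption, Pr(y_1, ..., y_l) > 0 iff every y_i has positive probability.
    """
    symbols = [y for y, p in pmf.items() if p > 0]
    return {"".join(ys) for ys in product(symbols, repeat=l)}

# PMF (31): all three symbols have positive probability (exact values do not matter here).
pmf_31 = {"0": 0.4, "1": 0.4, "01": 0.2}
# PMF (39): the symbol 01 is switched off.
pmf_39 = {"0": 0.5, "1": 0.5, "01": 0.0}

if __name__ == "__main__":
    x1, x2 = truncated_support(pmf_31, 1), truncated_support(pmf_31, 2)
    print("overlap under (31):", x1 & x2)
    y1, y2 = truncated_support(pmf_39, 1), truncated_support(pmf_39, 2)
    print("overlap under (39):", y1 & y2)
```

Under (39) all strings in the support of $X_l$ have length exactly $l$, so supports for different $l$ cannot overlap.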

Now that we have defined input processes in a way that does 
not allow for "counting twice", we can give the following 
definition. 

Definition 7. The entropy rate of an input process $\mathbf{Y}$ is defined 
as 

\[ H(\mathbf{Y}) = \limsup_{l \to \infty} \frac{\mathbb{H}(Y_1, \dots, Y_l)}{E[w(Y_1) + \dots + w(Y_l)]}. \tag{41} \]

The entropy rate of an input process of a constrained system 
relates to the combinatorial capacity as follows. 

Theorem 2. Let $\mathbf{A} = (A, w)$ represent a constrained system. 
The entropy rate of an input process $\mathbf{Y}$ of $\mathbf{A}$ is upper-bounded 
by the abscissa of convergence $Q$ of $G_{\mathbf{A}}$, and in particular, it 
is upper-bounded by the combinatorial capacity $C$ of $\mathbf{A}$. 

Proof: Since $\mathbf{Y}$ is an input process of $\mathbf{A}$, by definition, 
the sequence $\mathbf{X}$ of random variables $X_l = \mathrm{cat}(Y_1, \dots, Y_l)$ with 
truncated supports is an input source of $\mathbf{A}$. We thus have 

\begin{align}
H(\mathbf{Y}) &= \limsup_{l \to \infty} \frac{\mathbb{H}(Y_1, \dots, Y_l)}{E[w(\mathrm{cat}(Y_1, \dots, Y_l))]} \tag{42} \\
&= \limsup_{l \to \infty} \frac{\mathbb{H}(X_l)}{E[w(X_l)]} \tag{43} \\
&\leq R_{\mathbf{X}} \tag{44} \\
&\leq Q \tag{45} \\
&= C \tag{46}
\end{align}

where the equality in (43) follows from the definition of $\mathbf{X}$, 
the inequality in (44) follows from Lemma 2, the inequality in 
(45) follows from Lemma 3, and the equality in (46) follows 
from Theorem 1. ■ 
We call an input process $\mathbf{Y}^*$ of a constrained system $\mathbf{A}$ 
maxentropic if for any input process $\mathbf{Y}$ of $\mathbf{A}$ we have $H(\mathbf{Y}^*) \geq 
H(\mathbf{Y})$. Because of Theorem 2, the condition $H(\mathbf{Y}^*) = C$, i.e., that the entropy 
rate of $\mathbf{Y}^*$ is equal to the combinatorial capacity of $\mathbf{A}$, is 
sufficient for $\mathbf{Y}^*$ to be maxentropic. It is important 
to note that Theorem 2 does not claim that $H(\mathbf{Y}^*) = C$ for any 
constrained system $\mathbf{A}$. For a large class of constrained systems, 
however, $H(\mathbf{Y}^*) = C$. For this class, Theorem 2 is quite useful: 
assume that we want to show that system $\mathbf{A}$ belongs to this 
class. Without Theorem 2, we have to maximize the entropy 
rate over all input processes of $\mathbf{A}$. Once we have determined 
the maximum entropy rate, we compare it to the combinatorial 
capacity and find that both are equal. An example of this 
approach can be found in the proof of [5, Theorem 5.1]. With 
Theorem 2, we can do something different: we look for an 
input process whose entropy rate is equal to the combinatorial 
capacity. Once we have found such an input process, we invoke 
Theorem 2 and are done. We illustrate this new approach in 
Example 5. 

Example 5. A maxentropic input process of $S_{(j,k)}$. 

Let $\mathbf{Y} = \{Y_l\}_{l=1}^{\infty}$ denote a random process with realizations 
$(y_1, \dots, y_l) \in \mathcal{Y}^l$. Let $\mathcal{Y}$ be given by 

\[ \mathcal{Y} = (0 \cup \dots \cup \underbrace{0\cdots0}_{k\text{ times}})(1 \cup \dots \cup \underbrace{1\cdots1}_{j\text{ times}}). \tag{E.6} \]

From Example 4 we know that the sequence of random 
variables $\{X_l\}_{l=1}^{\infty}$ with $X_l = \mathrm{cat}(Y_1, \dots, Y_l)$ is an input 
source of $S_{(j,k)}$; therefore, $\mathbf{Y}$ is an input process of $S_{(j,k)}$. 
Let $\{Y_l\}_{l=1}^{\infty}$ be independent, identically distributed (IID) 
according to the PMF 

\[ p_Y(y) = e^{-w(y)R}, \quad y \in \mathcal{Y} \tag{E.7} \]

where $R$ is given by the largest positive real solution of 

\[ \sum_{y \in \mathcal{Y}} e^{-w(y)s} = (e^{-s} + \dots + e^{-js})(e^{-s} + \dots + e^{-ks}) = 1. \tag{E.8} \]

Comparing (E.8) with (E.4) in Example 3 we see that 
$R = C$, i.e., $R$ is equal to the combinatorial capacity $C$ 
of $S_{(j,k)}$. The entropy rate of $\mathbf{Y}$ is 

\begin{align}
H(\mathbf{Y}) &= \limsup_{l \to \infty} \frac{\mathbb{H}(Y_1, \dots, Y_l)}{E[w(Y_1) + \dots + w(Y_l)]} \tag{E.9} \\
&= \limsup_{l \to \infty} \frac{l\,\mathbb{H}(Y)}{l\,E[w(Y)]} \tag{E.10} \\
&= R = C \tag{E.11}
\end{align}

where the second equality follows from the independence 
bound on entropy [15] together with the fact that the $Y_l$ 
are IID, and the linearity of $w$. Since, from Theorem 2, 
$H(\mathbf{Y}) \leq C$, we conclude that $\mathbf{Y}$ is a maxentropic input 
process of $S_{(j,k)}$. 
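The construction of Example 5 can be checked numerically. The sketch below builds the PMF (E.7) on the set (E.6) of (0-run)(1-run) blocks, verifies that it sums to one, and evaluates the entropy per average weight of a single block, which by (E.10) equals the entropy rate of the IID process. The function names and the bisection solver for (E.8) are our own illustrative choices.

```python
import math

def capacity_jk(j, k, tol=1e-13):
    """Largest positive real solution of (E.8), found by bisection on [0, ln 2]."""
    def f(s):
        x = math.exp(-s)
        ones = sum(x ** i for i in range(1, j + 1))
        zeros = sum(x ** i for i in range(1, k + 1))
        return ones * zeros - 1.0
    lo, hi = 0.0, math.log(2.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def maxentropic_block_pmf(j, k):
    """PMF (E.7) on the blocks 0^a 1^b with 1 <= a <= k and 1 <= b <= j."""
    r = capacity_jk(j, k)
    pmf = {"0" * a + "1" * b: math.exp(-(a + b) * r)
           for a in range(1, k + 1) for b in range(1, j + 1)}
    return pmf, r

def entropy_rate_iid(pmf):
    """Entropy per average weight of one block; each symbol has weight 1, cf. (E.2)."""
    entropy = -sum(p * math.log(p) for p in pmf.values() if p > 0)
    avg_weight = sum(p * len(y) for y, p in pmf.items())
    return entropy / avg_weight

if __name__ == "__main__":
    pmf, r = maxentropic_block_pmf(2, 3)
    print("sum of PMF  :", sum(pmf.values()))
    print("capacity C  :", r)
    print("entropy rate:", entropy_rate_iid(pmf))
```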

VI. Conclusions 

In this work, we showed for a general class of constrained 
systems (including those with non-regular constraints and 
dropping the "not too dense" assumption for the weight set) 
that the maximum entropy rate of input processes is upper- 
bounded by the combinatorial capacity of the considered 
system. This general result allows for a new approach to show 
that maximum entropy rate and combinatorial capacity are 
equal: with our result, it is enough to find "some" input process 
whose entropy rate is equal to the combinatorial capacity of 
the considered system. Equality of maximum entropy rate and 
combinatorial capacity then follows from our result. In contrast 
to previous works (except for some works that consider 
specific classes of constrained systems), we do not use any result 
from matrix theory in our derivations. Our framework, which 
is based on generating functions, therefore allows, besides 
being more general, for a new approach to investigating regular 
systems. We illustrated this by applying our results to the (j, k) 
constraint. 